(54) Method and apparatus for providing for notification of task termination in an information
handling system



(57) A method and apparatus for ensuring that a
process interacting with a failing process is notified of
the failure of that process. Each process has a unique
process identifier (PID) associated with it. Each process
optionally has an affinity list containing one or more
entries, each of which contains the identifier of a process
that is to be notified when the process fails. A process
updates the affinity list of a target process (either itself
or another process) by calling an affinity service of the
operating system (OS) kernel, specifying the type of
operation (add or delete), the identifier of the target
process, the identifier of the process that is to be notified,
and the type of event that is to be generated for the process
that is to be notified. When a process fails, a process
termination service of the OS kernel examines the affinity
list of the failing process and, for each entry in the
list, generates an event of the specified type for the
process specified as to be notified.

Description

[0001] This invention relates to a method and
apparatus for providing for notification of task termination
and, more particularly, to a method and apparatus for
providing for notification of process termination in a
client/server system.

[0002] Client/server computing systems are well
known in the art. In a client/server system, a client
process (or simply "client") issues a request to a server
process (or simply "server"), either on the same system or
on a different system, to perform a specified service.
Upon receiving the request, the server process performs
the requested service and returns the result in a
response to the client process.

[0003] When creating a client/server application on a
single system, there is frequently a need for a client to
communicate requests to a server and to wait for the
server to respond. Similarly, there can be multiple server
processes that need to communicate with multiple client
processes. If a client is waiting for a response from a
server and the server terminates, the client process may
hang in a wait until a user or operator makes a request
to terminate the client process. Similarly, a server may
be waiting for a response from a client and have the
client terminate. Both client and server can add timer calls
into their logic to cause the wait to time out, but this can
cause unnecessary path length and requires the client
or server application to pick a suitable time period.

[0004] In UNIX®-based systems, there are several
programming constructs that can be used to keep track
of the connection between multiple processes. If an
application uses a fork() or spawn() service to create a
child process, then the two processes are tied together
by the UNIX framework. That is, if the child process
terminates, the parent process is sent a SIGCHLD signal.
If the parent process terminates, the child process is
sent a SIGHUP signal. However, since interacting server
and client processes are usually not bound together
by this parent-child relationship, this mechanism is of
little use as a general notification mechanism in
UNIX-based systems.

[0005] According to one aspect of the invention there
is provided a method of providing for notification of task
termination in an information handling system having a
plurality of interacting tasks, the method comprising the
steps of: defining for each of one or more target tasks
an affinity list containing one or more entries for other
tasks that are to be notified on termination of the target
task; in response to receiving an affinity request
specifying a target task and another task, adding an entry for
the other task to an affinity list defined for the target task;
and in response to detecting a termination of a target
task, notifying each other task contained in the affinity
list defined for the target task.

[0006] According to a second aspect of the invention
there is provided apparatus for providing for the
notification of task termination in an information handling
system having a plurality of interacting tasks, the apparatus
comprising: means for defining for each of one or more
target tasks an affinity list containing one or more entries
for other tasks that are to be notified on termination of
the target task; means responsive to receiving an affinity
request specifying a target task and another task for
adding an entry for the other task to an affinity list
defined for the target task; and means responsive to
detecting a termination of a target task for notifying each
other task contained in the affinity list defined for the
target task.

[0007] According to a third aspect of the invention
there is provided a computer program element comprising
computer program code means executable by the
computer to: define for each of one or more target tasks
an affinity list containing one or more entries for other
tasks that are to be notified on termination of the target
task; in response to receiving an affinity request
specifying a target task and another task, add an entry for the
other task to an affinity list defined for the target task;
and in response to detecting a termination of a target
task, notify each other task contained in the affinity list
defined for the target task.

[0008] Thus a solution to the aforementioned problem
is provided by the pid affinity service of the present
invention, described below. The term pid stands for
process id. Both the server and client processes have unique
PIDs. The pid affinity service is used to create an affinity
or bond between the client and server process, such that
when one of them terminates, a mechanism is provided
to drive a signal to notify the other waiting process.

[0009] As will be described below, each process in the
operating system optionally has a pid affinity list that
identifies processes that wish to be notified (via signal)
when the process terminates. The pid affinity service
provides the mechanism for a client to add its pid to a
server's pid affinity list or for the server to add its pid to
the client's pid affinity list. It is up to an application to
determine which processes use the pid affinity service.

[0010] As an example of the operation of the present
invention, suppose that a client is about to make a
request to a server using a message queue. Prior to
placing the request on the server input queue (by issuing a
msgsnd system call), the client calls the pid affinity
service to add its pid to the pid affinity list of the server. The
client then issues a msgsnd system call to place the
request on the server input queue. The client then issues
a msgrcv system call to wait for a response from the
server. While the client is in this message queue wait, the
server may terminate. If this happens, the kernel will see the
pid affinity list and send a signal to each process
(represented by a pid) that is on the pid affinity list. The signal
will wake up the client process from the msgrcv wait and
allow it to fail the current request and return control to
the calling process.
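
By way of illustration only, the sequence of paragraph [0010] can be sketched as a small in-memory simulation. This is not the kernel implementation: the class `ToyKernel`, its methods, the PIDs and the signal name are all hypothetical, and the message queue calls are represented only by a comment.

```python
from collections import defaultdict


class ToyKernel:
    """In-memory stand-in for the OS kernel (hypothetical, for illustration)."""

    def __init__(self):
        self.affinity = defaultdict(list)   # target pid -> [(pid to notify, signal)]
        self.delivered = defaultdict(list)  # pid -> signals delivered to that pid

    def pid_affinity_add(self, target_pid, event_pid, signal):
        # Ask that event_pid receive `signal` if target_pid terminates.
        self.affinity[target_pid].append((event_pid, signal))

    def terminate(self, pid):
        # On termination, signal every process on the dying process's list.
        for event_pid, signal in self.affinity.pop(pid, []):
            self.delivered[event_pid].append(signal)


CLIENT_PID, SERVER_PID, SIGNAL = 100, 200, "SIGUSR1"  # assumed values

kernel = ToyKernel()
kernel.pid_affinity_add(SERVER_PID, CLIENT_PID, SIGNAL)  # done before msgsnd
# ... the client would now msgsnd its request and block in msgrcv ...
kernel.terminate(SERVER_PID)                             # the server fails
woken = kernel.delivered[CLIENT_PID]                     # signal ends the wait
print(woken)
```

The client registers before sending the request, so there is no window in which the server could fail unnoticed while the request is outstanding.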

[0011] The above description of a client/server
communication using message queues is simply one
example of how processes may communicate. They could
also use shared memory, semaphores or any other
communication mechanism. This example also described
the client and server as simple single-threaded
processes. It is possible for a server to be multithreaded and
handling many requests concurrently from multiple
clients. If such a server were to terminate, it would cause
the notification of all the clients in its pid affinity list.

[0012] A preferred embodiment of the invention will
now be described, by way of example only, with
reference to the accompanying drawings in which:

Fig. 1 shows the parameter list passed to the pid 
affinity service, a process information block and the 
pid affinity list; 

Fig. 2 shows the flow and logic for adding another
process PID to its own PID affinity list and the result
of termination of the calling process;

Fig. 3 shows the flow and logic for adding the caller's
PID to the PID affinity list of a target process and
the actions triggered by the termination of that
target process;

Fig. 4 shows the entry to the PID affinity service as 
well as the delete entry processing; 

Fig. 5 shows the add entry logic of the PID affinity 
service. 

[0013] Referring first to Figs. 2-3, an embodiment of
the present invention contains a PID affinity service 221
in the kernel address space 204 of a system also having
one or more user address spaces 202 including
processes 206 (process A) and 216 (process B). Kernel
address space 204 is part of an operating system (OS)
kernel (not separately shown) running, together with one
or more user programs in user address spaces 202, on
a general-purpose computer having a central
processing unit (CPU), main and secondary storage, and
various peripheral devices that are conventional in the art
and therefore not shown. Although the present invention
is not limited to any particular hardware or software
platform, a preferred embodiment may be implemented as
part of the IBM OS/390® operating system, running
on an IBM S/390® processor such as an S/390
Parallel Enterprise Server™ G4 or G5 processor.

[0014] Referring now to Fig. 1, each process in the
system has a process information block (PIB) 114
associated with it. Each PIB 114 contains a process id (PID)
116 uniquely identifying the process and a pointer 118
to a PID affinity list (PAL) 120, as well as other items that
are not related to the present invention and are therefore
not shown. For each call to the pid affinity service 221
to add a PID to the list 120, an entry 122 is made in the
list. Each entry 122 contains the PID 124 of the process
to be notified of an event and an event type 126, which
could be a signal number.
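
The structures of Fig. 1 described above might be modeled as follows. This is a minimal sketch: the class and field names are hypothetical, and a Python list stands in for the storage addressed by pointer 118.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PalEntry:
    """One entry 122 of a PID affinity list (names are illustrative)."""
    pid: int    # PID 124 of the process to be notified
    event: int  # event type 126, for example a signal number


@dataclass
class ProcessInfoBlock:
    """Process information block 114, reduced to the fields discussed."""
    pid: int                               # PID 116 of this process
    pal: Optional[List[PalEntry]] = None   # pointer 118; None until a list exists


server = ProcessInfoBlock(pid=200)
has_list_before = server.pal is not None
server.pal = [PalEntry(pid=100, event=10)]  # process 100 wants event 10
print(server)
```

Leaving `pal` unset until the first add reflects the statement that each process only optionally has an affinity list.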



[0015] A pid affinity parameter list (PL) 100 contains
the parameters specified by an application program as
input to the pid affinity service 221, as well as output
from the pid affinity service 221. These parameters
include a function code 102, a target process parameter
104, an event process parameter 106, an event
parameter 108 and a return code 110. Parameters 102-108
are input parameters supplied by the calling application
to the PID affinity service 221, while return code 110 is
an output parameter returned by the PID affinity service
221 to the calling application.

[0016] The function code 102 specifies which pid
affinity service function is requested by the application
program. Supported function codes 102 are adding an
entry 122 to an affinity list 120 and deleting an entry 122
from an affinity list 120.

[0017] The target process parameter 104 specifies
the target process (as identified by its PID) whose
affinity list 120 is the target of the operation specified by the
function code parameter 102.

[0018] The event process parameter 106 has different
uses depending on the function code parameter 102
specified. The event process 106 identifies the
process that is to be delivered the event when the target
process terminates. When an application specifies a
function code 102 to add an entry 122 to an affinity list
120, the contents of this parameter 106 are copied into
an entry 124 in the affinity list 120 of the process
specified by the target process parameter 104. When an
application specifies the function code parameter 102 to
delete an entry 122 from an affinity list 120, the contents
of this parameter 106 are compared with existing entries
124 in the affinity list 120 of the process specified by the
target process parameter 104. If an entry 122 with a
matching process identifier 124 is found, it is cleared
and is available to be reused.

[0019] The event parameter 108 specifies the event
126 to be generated when the target process 104
terminates. This parameter 108 is unused when the
function code parameter 102 requests deletion of an entry
122. When the function code parameter 102 specifies
adding an entry 122 to an affinity list 120, the contents
of this parameter 108 are copied to an entry 126 in the
affinity list 120 of the process specified by the target
process parameter 104.

[0020] The fifth parameter 110 contains the return
code generated by the pid affinity service. It is used to
indicate the success or failure of the pid affinity service
to the application program.
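
The five parameters 102-110 can be pictured as a single record passed to the service. The sketch below is illustrative only; the enumeration values, field names and defaults are assumptions and are not taken from the source.

```python
from dataclasses import dataclass
from enum import Enum


class FunctionCode(Enum):
    """Function code 102; the numeric values are arbitrary placeholders."""
    ADD = 1
    DELETE = 2


@dataclass
class PidAffinityParams:
    """Parameter list 100 (hypothetical field names)."""
    function_code: FunctionCode  # 102: which operation is requested
    target_pid: int              # 104: process whose affinity list is updated
    event_pid: int               # 106: process to be notified
    event: int = 0               # 108: event to generate; unused for DELETE
    return_code: int = 0         # 110: output, filled in by the service


add_request = PidAffinityParams(FunctionCode.ADD, target_pid=200,
                                event_pid=100, event=10)
delete_request = PidAffinityParams(FunctionCode.DELETE, target_pid=200,
                                   event_pid=100)
print(add_request)
print(delete_request)
```

A delete request needs only the target and event process, since the entry is matched on the process identifier alone.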

[0021] Fig. 2 shows the usage of the pid affinity
service 221 when a client program adds its PID to the pid
affinity list 120 of a server. Fig. 2 shows user address
spaces 202 and a kernel address space 204. The kernel
address space 204 is where services are provided that
allow applications to communicate with other user
address spaces 202. In this example, user address spaces
202 include a client address space 206 (process A) that
is communicating with a server address space 216
(process B).

[0022] The client 206 initially assigns a work request
to the server 216 (step 208). As discussed earlier, one
means of doing this is by placing a message on a
message queue. After assigning the work request at step
208, the client 206 waits for a response from the server
216 by invoking a wait function 212 in the kernel address
space 204 (step 210). The wait function 212 could be a
general-purpose wait function, or it could be a function
like msgrcv that waits for a message or a signal. This is
standard programming practice on UNIX systems. As
described, for example, in W. R. Stevens, UNIX Network
Programming, 1990, pages 126-137, incorporated
herein by reference, in a UNIX system a process wishing
to send a message to another process may issue a
msgsnd system call to place a message in a message
queue. That other process may in turn issue a msgrcv
system call to retrieve the message from the message
queue.

[0023] In the server space 216, shown as process B,
the server receives the work request from the client 206
(step 218). This could be accomplished using a function
like msgrcv to receive a message placed on a message
queue by the client 206 at step 208. After receiving the
work request at step 218, the server 216 calls the pid
affinity service (pid_affinity) 221 of the present invention
with a function code 102 to add, a target process PID
104 of process B (itself), an event process 106 of
process A, and an event 108 which could be a particular
signal (step 220). Once this step is completed, should
anything happen to terminate the server process 216, the
client process 206 is guaranteed to be notified with the
requested event 108.

[0024] Next, server 216 processes the work request
assigned at step 208 (step 222). Assuming no errors
occur, the server 216 completes the work request (step
222) and then notifies the client 206 of the completion
(step 226). This could be accomplished by sending a
message to the client 206 with the results of the work
request. The msgsnd by the server 216 would wake up
the client 206 in a msgrcv wait 212. After notifying client
process 206 at step 226, the server 216 calls the pid
affinity service 221 with function code 102 to delete an
entry 122 in the PID affinity list 120 for a target process
104 set to process B 216 (itself) (step 228). The event
process 106 is set to process A 206. After the pid affinity
service 221 completes the request, the entry 122 for the
client process 206 is removed from the PID affinity list
120 for the server process 216.

[0025] During this server processing, suppose a
terminating event 224 occurs, which prevents the server
from completing the work request at step 226. In this
case, the kernel 204 gets control in process termination
230. As part of process termination 230, the kernel
handles any entries 122 in the PID affinity list 120 for the
terminating process 216. If an entry in the PID affinity
list 120 is filled in (step 232), then the kernel generates
the event 126 and targets this event to the PID 124 in
the entry 122 of the PID affinity list 120 (step 234).

[0026] The generation of the event at step 234 causes
the target process 206 to be resumed from its wait
condition 212 (step 236) and triggers the delivery of the
abnormal event 126 to an event exit 238 of process A 206.
The client code in the event exit 238 is notified of the
termination of the server 216 (process B) from which it
was awaiting a response (step 240). The client event
exit 238 can then decide whether to terminate or retry
the request. What the client does when notified is not
part of the present invention and is therefore not
described.
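
The termination handling of steps 232-234 and the delivery to an event exit can be sketched as below. This is a simplified model under stated assumptions: affinity lists and event exits are held in plain dictionaries, an event exit is a Python callable, and the function name `process_termination` and the argument order of the exit routine are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

# pid -> list of (pid to notify, event) pairs: the PID affinity lists
affinity_lists: Dict[int, List[Tuple[int, int]]] = {}
# pid -> event exit routine registered by that process (illustrative only)
event_exits: Dict[int, Callable[[int, int], None]] = {}


def process_termination(dying_pid: int) -> int:
    """Generate the recorded event for each filled-in entry; return the count."""
    delivered = 0
    for notify_pid, event in affinity_lists.pop(dying_pid, []):  # step 232
        exit_routine = event_exits.get(notify_pid)
        if exit_routine is not None:        # skip processes that no longer exist
            exit_routine(dying_pid, event)  # step 234: drive the event exit
            delivered += 1
    return delivered


log = []
event_exits[100] = lambda dead_pid, event: log.append((dead_pid, event))
affinity_lists[200] = [(100, 10), (300, 10)]  # pid 300 has no exit registered
count = process_termination(200)
print(count, log)
```

What the exit routine then does (terminate, retry) is left to the application, as the description states.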

[0027] Fig. 3 shows another model supported by the
pid affinity service 221. In this case, a client process 302
(process C) determines the PID of a server process 320
(process D) with which it will soon communicate. This
may be accomplished with shared memory,
configuration files or other means not related to the present
invention. The client process 302 then calls the pid affinity
service 221 with a function code 102 of add, a target
process 104 PID for process D 320 (the server), an
event process 106 set to process C 302 (the client, itself)
and the event 108 it wishes to receive if the server 320
terminates while processing its request (step 304).

[0028] The client 302 then assigns work to the server
process 320 via a message queue or other
communication mechanism (step 306). The client 302 then calls the
wait service 212 to wait for a response from the server
320 (step 308). The wait service 212 puts the client 302
to sleep until the requested function completes or an
abnormal event is received.

[0029] In the meantime, the server 320 has received
the work request (step 322) and is processing the work
(step 324). If all works successfully, the server 320
notifies the client 302 when the work completes (step 328).
This notification at step 328 causes the client process
302 to exit the wait function 212 with a successful return
code. Upon receiving control back from wait, the client
302 calls the pid affinity service 221 to undo the call
made at step 304 (step 310). This call at step 310 will
set the function code 102 to request delete, the target
process 104 will identify server process D 320 and the
event process 106 will identify this client 302.

[0030] If a terminating event 326 hits the server 320,
then it will trigger the process termination service
(process_term) 230. Process termination service 230
will run through the PID affinity list 120 for server
process D 320 and for each entry in the PID affinity list (step
232), it will generate 224 the requested event 126 to the
target PID 124 (step 234). In this case, the target PID
124 identifies client process C 302 and the event 126 is
what was passed in the event parameter 108 in step
304.

[0031] When the event is generated at step 234, it
causes client process C 302 to be taken out of the wait
212 with an interrupt (step 340). The wait function 212,
instead of returning to the caller after step 308, now
passes control to the event exit 311. The event exit 311
is notified of the termination of server process D (step
312). At this point, the client code 302 can either
terminate, retry the request or request a different service.
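
The client-side model of Fig. 3 (steps 304-312) can be sketched as follows. The sketch is hypothetical throughout: `ToyKernel`, `client_request` and the `wait` callable, which stands in for wait service 212 and reports either a response or an event, are assumptions made for illustration.

```python
class ToyKernel:
    """Hypothetical stand-in for the kernel's pid affinity service."""

    def __init__(self):
        self.pal = {}  # target pid -> {pid to notify: event}

    def pid_affinity(self, function, target_pid, event_pid, event=None):
        entries = self.pal.setdefault(target_pid, {})
        if function == "add":
            entries[event_pid] = event
        elif function == "delete":
            entries.pop(event_pid, None)
        else:
            return -1  # unknown function code
        return 0


def client_request(kernel, my_pid, server_pid, wait):
    """Register with the server, wait, then deregister on success."""
    kernel.pid_affinity("add", server_pid, my_pid, event="SIGUSR1")  # step 304
    # Step 306, assigning the work to the server, is omitted in this sketch.
    outcome, payload = wait()                                        # step 308
    if outcome == "response":
        kernel.pid_affinity("delete", server_pid, my_pid)            # step 310
        return payload
    # Event exit 311: the server terminated; the caller may retry or give up.
    return None


kernel = ToyKernel()
ok = client_request(kernel, 100, 200, lambda: ("response", "done"))
after_ok = dict(kernel.pal[200])  # entry removed after a normal completion
failed = client_request(kernel, 100, 200, lambda: ("event", "SIGUSR1"))
print(ok, after_ok, failed)
```

Deleting the entry after a successful response keeps a later, unrelated termination of the server from signalling a client that is no longer waiting.
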
[0032] Fig. 4 shows the processing of the PID affinity
service 221. On entry, the service 221 validates the
caller's parameters (step 402). If the function code 102,
target process PID 104, event process PID 106, or event
108 is invalid, then the service 221 sets a unique failing
return code (step 404) and returns to the caller (step
406). Assuming all parameters are valid, the service 221
obtains a process lock for the target process 104 (step
408). This lock serializes updates to the PID affinity list
120 (hereafter referred to as PAL) of the target process
104 for multiple callers.

[0033] If the target process 104 does not yet have a
PAL 120 (step 410), then storage is obtained for the PAL
120 and the location of the PAL 120 is stored in the
Process Information Block (PIB) 114 in field 116 (step 412).
Next the function code 102 is tested to determine
whether add or delete processing is requested (step 416). If
add processing is requested, processing is as described
in Fig. 5 (step 418).
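
The entry processing of Fig. 4 (validate, lock, allocate the PAL on first use, then dispatch) can be sketched as below. Everything named here is an assumption: the return code values, the signal range used to validate the event, checking the event only for add requests, and the use of `threading.Lock` as the process lock. The add and delete bodies are deliberately simplistic.

```python
import threading
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Hypothetical return codes; the source only says each failure code is unique.
RC_OK, RC_BAD_FUNCTION, RC_BAD_TARGET, RC_BAD_EVENT_PID, RC_BAD_EVENT = range(5)


@dataclass
class Process:
    pid: int
    lock: threading.Lock = field(default_factory=threading.Lock)
    pal: Optional[List[Tuple[int, int]]] = None  # allocated on first use


processes: Dict[int, Process] = {}


def pid_affinity(function: str, target_pid: int, event_pid: int,
                 event: int = 0) -> int:
    # Steps 402-406: validate, returning a distinct code per bad parameter.
    if function not in ("add", "delete"):
        return RC_BAD_FUNCTION
    if target_pid not in processes:
        return RC_BAD_TARGET
    if event_pid not in processes:
        return RC_BAD_EVENT_PID
    if function == "add" and not 1 <= event <= 64:  # assumed signal range
        return RC_BAD_EVENT
    target = processes[target_pid]
    with target.lock:             # step 408: serialize updates to this PAL
        if target.pal is None:    # step 410: no PAL yet
            target.pal = []       # step 412: obtain storage for the PAL
        if function == "add":     # step 416: test the function code
            target.pal.append((event_pid, event))
        else:
            target.pal[:] = [e for e in target.pal if e[0] != event_pid]
    return RC_OK                  # the lock is released on leaving the block


processes[100] = Process(100)
processes[200] = Process(200)
rc_add = pid_affinity("add", 200, 100, 10)
rc_bad = pid_affinity("add", 999, 100, 10)
print(rc_add, rc_bad, processes[200].pal)
```

Taking the lock per target process, rather than one global lock, lets updates to different affinity lists proceed concurrently.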

[0034] For delete processing, the PAL 120 is scanned
for an entry 122 that has a PID 124 that matches the
event process PID 106 passed as input (step 414). If a
matching entry 122 is found (step 420), then the entry
122 is cleared and the last entry 122 in the PAL 120 is
moved to the cleared entry to keep the table packed
(step 422). The process lock is then released and
control is returned to the caller (step 406). If the entry 122
is not found, then the process lock is released and
control is returned to the caller without performing the
deletion step 422 (step 406).
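
The delete step, which keeps the table packed by moving the last entry into the cleared slot, can be illustrated as follows. The function name and the representation of an entry as a `(pid, event)` tuple are assumptions; locking is omitted.

```python
from typing import List, Tuple


def pal_delete(pal: List[Tuple[int, int]], event_pid: int) -> bool:
    """Remove the entry for event_pid; return True if one was found."""
    for i, (pid, _event) in enumerate(pal):  # step 414: scan for a match
        if pid == event_pid:                 # step 420: matching entry found
            pal[i] = pal[-1]  # step 422: move the last entry into the hole
            pal.pop()         # drop the now-duplicated tail slot
            return True
    return False              # not found: nothing to delete


pal = [(100, 10), (101, 10), (102, 12)]
found = pal_delete(pal, 100)
missing = pal_delete(pal, 999)
print(found, missing, pal)
```

Moving only the last entry makes deletion constant-time once the match is located, at the cost of not preserving the order of entries.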

[0035] Fig. 5 shows the processing to add an entry to
the PAL 120. The target process PID 104 is tested (step
502) to determine if it is the same as the caller's PID
116. If they match, it means that if the calling process
terminates, it will cause a signal (event 108) to be sent
to the event process 106. Before adding the entry 122
to the PAL 120, a test is made to determine if the calling
process is allowed to send a signal (event 108) to the
event process 106 (step 504). If the caller is not
permitted to send the signal (event 108), then the service sets
an error code (step 508), releases the process lock and
returns to the caller (step 518).

[0036] Once past the initial tests, the code loops
through the PAL 120 (step 506). Looking at an entry 122
in the PAL 120, if the current PID 124 is the same as the
event PID 106 (step 510), then this entry 122 is overlaid
by storing the event pid 106 over the PID in the entry
124 and the event 108 over the event 126 in the entry
122 (step 512). If the PIDs don't match at step 510, then
if there are more entries in the PAL 120 (step 514), the
loop continues at step 506.

[0037] If the event PID 106 is not found in the PAL,
then a new entry 122 is chosen. This will normally just
use the next unused entry 122 in the PAL 120. If the PAL
120 is full, a new larger PAL is obtained, the old PAL 120
is copied into the new PAL and the address of the new
PAL is stored in the PIB 114 in field 116. Since the
process is locked (step 408), this can be done safely. After
copying the old PAL to the new PAL, the old PAL is freed.
The new entry is then stored as in step 512 using an
unused entry 122 in the PAL. The process lock is
released and control is returned to the caller (step 518).
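
The add logic of steps 506-514, together with the growth of a full PAL, can be sketched as below. The class `Pal`, the function `pal_add` and the doubling growth policy are assumptions (the description says only that a larger PAL is obtained); the permission test of step 504 and the process lock are omitted.

```python
class Pal:
    """Fixed-capacity table of (pid, event) entries (illustrative only)."""

    def __init__(self, capacity=2):
        self.slots = [None] * capacity  # capacity must be at least 1
        self.used = 0                   # number of filled-in entries


def pal_add(pal, event_pid, event):
    """Add or overlay an entry; return the PAL to store back in the PIB."""
    for i in range(pal.used):                  # steps 506-514: loop over entries
        if pal.slots[i][0] == event_pid:       # step 510: PID already present
            pal.slots[i] = (event_pid, event)  # step 512: overlay the entry
            return pal
    if pal.used == len(pal.slots):             # the PAL is full
        bigger = Pal(capacity=2 * len(pal.slots))  # assumed growth policy
        bigger.slots[:pal.used] = pal.slots        # copy the old PAL across
        bigger.used = pal.used
        pal = bigger                           # the old PAL is left to be freed
    pal.slots[pal.used] = (event_pid, event)   # use the next unused entry
    pal.used += 1
    return pal


pal = Pal(capacity=2)
for pid in (100, 101, 102):
    pal = pal_add(pal, pid, 10)   # the third add forces the table to grow
pal = pal_add(pal, 101, 12)       # overlay: pid 101 now wants event 12
print(pal.used, len(pal.slots), pal.slots)
```

Returning the table from `pal_add` mirrors the step of storing the address of the new PAL in the PIB after a reallocation.
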
[0038] Although a particular embodiment of the
invention has been shown and described, various
modifications and extensions within the scope of the appended
claims will be apparent to those skilled in the art.



Claims 

1. A method of providing for notification of task
termination in an information handling system having a
plurality of interacting tasks, the method comprising
the steps of:

defining for each of one or more target tasks an
affinity list containing one or more entries for
other tasks that are to be notified on termination
of the target task;

in response to receiving an affinity request
specifying a target task and another task, adding
an entry for the other task to an affinity list
defined for the target task; and

in response to detecting a termination of a target
task, notifying each other task contained in
the affinity list defined for the target task.

2. The method of claim 1 in which the affinity request
originates from the target task.

3. The method of claim 1 in which the affinity request
originates from the other task.

4. The method of claim 1 in which the affinity request
is of a first type, the method comprising the further
step of:

in response to receiving an affinity request of
a second type specifying a target task and another
task, deleting an entry for the other task from the
affinity list defined for the target task.

5. The method of any preceding claim in which the
adding step comprises the steps of:

determining whether an affinity list is already
defined for the target task;

if an affinity list is already defined for the target
task, adding an entry for the other task to the
affinity list defined for the target task; and

if an affinity list is not already defined for the
target task, defining an affinity list for the target
task and adding an entry for the other task to
the affinity list defined for the target task.

6. The method of any preceding claim in which the
tasks are processes having separate address spaces.

7. The method of claim 1 in which the tasks are user
tasks and the steps are performed by an operating
system kernel.

8. The method of claim 1 in which the affinity request
specifies a type of operation to be performed on the
affinity list defined for the target process.



9. The method of claim 1 in which the affinity request
specifies an event to be generated for the other task
upon termination of the target task.

10. Apparatus for providing for the notification of task
termination in an information handling system having
a plurality of interacting tasks, the apparatus
comprising:

means for defining for each of one or more target
tasks an affinity list containing one or more
entries for other tasks that are to be notified on
termination of the target task;

means responsive to receiving an affinity request
specifying a target task and another task
for adding an entry for the other task to an
affinity list defined for the target task; and

means responsive to detecting a termination of
a target task for notifying each other task
contained in the affinity list defined for the target
task.

11. The apparatus of claim 10 in which the affinity
request is of a first type, the apparatus further
comprising:

means responsive to receiving an affinity request
of a second type specifying a target task and
another task for deleting an entry for the other task
from the affinity list defined for the target task.

12. The apparatus of claim 1 in which the adding means
comprises:

means for determining whether an affinity list is
already defined for the target task;

means for adding an entry for the other task to
the affinity list defined for the target task if an
affinity list is already defined for the target task;
and

means for defining an affinity list for the target
task and adding an entry for the other task to
the affinity list defined for the target task if an
affinity list is not already defined for the target
task.

13. A computer program element comprising computer
program code means executable by the computer
to:

define for each of one or more target tasks an
affinity list containing one or more entries for
other tasks that are to be notified on termination
of the target task;

in response to receiving an affinity request
specifying a target task and another task, add
an entry for the other task to an affinity list
defined for the target task; and

in response to detecting a termination of a target
task, notify each other task contained in the
affinity list defined for the target task.

14. The computer program element of claim 13 in which
the affinity request is of a first type, further comprising
computer program code means executable by
the computer to:

in response to receiving an affinity request of
a second type specifying a target task and another
task, delete an entry for the other task from the
affinity list defined for the target task.

15. The computer program element of claim 13 or claim
14 embodied on a computer readable medium.